Hurst parameter
- North America > United States (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > United Kingdom (0.04)
- Europe > Croatia > Primorje-Gorski Kotar County > Rijeka (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
- Information Technology > Communications (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Artificial Intelligence > Cognitive Science (0.67)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > Switzerland > Basel-City > Basel (0.04)
- Europe > Norway > Eastern Norway > Oslo (0.04)
- (3 more...)
Parameter Estimation of Long Memory Stochastic Processes with Deep Neural Networks
Bálint Csanády, Lóránt Nagy, Dániel Boros, Iván Ivkovic, Dávid Kovács, Dalma Tóth-Lakits, László Márkus, András Lukács
We present a purely deep neural network-based approach for estimating long memory parameters of time series models that incorporate the phenomenon of long-range dependence. Parameters, such as the Hurst exponent, are critical in characterizing the long-range dependence, roughness, and self-similarity of stochastic processes. The accurate and fast estimation of these parameters holds significant importance across various scientific disciplines, including finance, physics, and engineering. We harnessed efficient process generators to provide high-quality synthetic training data, enabling the training of scale-invariant 1D Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) models. Our neural models outperform conventional statistical methods, even those augmented with neural networks. The precision, speed, consistency, and robustness of our estimators are demonstrated through experiments involving fractional Brownian motion (fBm), the Autoregressive Fractionally Integrated Moving Average (ARFIMA) process, and the fractional Ornstein-Uhlenbeck (fOU) process. We believe that our work will inspire further research in the field of stochastic process modeling and parameter estimation using deep learning techniques.
- North America > United States (0.67)
- Europe > Hungary (0.14)
- Banking & Finance > Trading (0.68)
- Information Technology > Security & Privacy (0.67)
- Energy > Oil & Gas > Upstream (0.37)
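The pipeline this abstract describes (sample random Hurst exponents, synthesize fBm paths, and regress H with a scale-invariant 1D CNN) can be sketched compactly. The sketch below is illustrative rather than the authors' code: it substitutes a slow Cholesky-based fractional Gaussian noise generator for the efficient generators the paper uses, and the path length, network width, and training budget are placeholder choices.

```python
# Minimal sketch (assumptions: path length, architecture, and training budget
# are illustrative; the Cholesky generator stands in for the paper's fast ones).
import numpy as np
import torch
import torch.nn as nn

N = 256  # path length (placeholder)

def fgn_cholesky(H, n=N):
    """Cholesky factor of the fractional Gaussian noise covariance matrix."""
    k = np.arange(n)
    gamma = 0.5 * ((k + 1.0) ** (2 * H) - 2.0 * k ** (2 * H)
                   + np.abs(k - 1.0) ** (2 * H))
    cov = gamma[np.abs(k[:, None] - k[None, :])]
    return np.linalg.cholesky(cov + 1e-10 * np.eye(n))

def sample_batch(batch=64):
    """fBm paths with random H; fBm = cumulative sum of fGn increments."""
    hs = np.random.uniform(0.05, 0.95, batch)
    paths = np.stack([np.cumsum(fgn_cholesky(h) @ np.random.randn(N))
                      for h in hs])
    # per-path standardization: the scale invariance the abstract refers to
    paths = (paths - paths.mean(1, keepdims=True)) / (paths.std(1, keepdims=True) + 1e-8)
    return (torch.tensor(paths, dtype=torch.float32).unsqueeze(1),
            torch.tensor(hs, dtype=torch.float32))

model = nn.Sequential(                       # tiny 1D CNN regressor for H
    nn.Conv1d(1, 32, 5, padding=2), nn.ReLU(),
    nn.Conv1d(32, 32, 5, padding=2), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(200):                      # a real run needs far more data
    x, y = sample_batch()
    loss = nn.functional.mse_loss(model(x).squeeze(1), y)
    opt.zero_grad(); loss.backward(); opt.step()
```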
Fractal Patterns May Unravel the Intelligence in Next-Token Prediction
Ibrahim Alabdulmohsin, Vinh Q. Tran, Mostafa Dehghani
We study the fractal structure of language, aiming to provide a precise formalism for quantifying the properties that may have been previously suspected but not formally shown. We establish that language is: (1) self-similar, exhibiting complexities at all levels of granularity, with no particular characteristic context length, and (2) long-range dependent (LRD), with a Hurst parameter of approximately H = 0.70 ± 0.09. Based on these findings, we argue that short-term patterns/dependencies in language, such as in paragraphs, mirror the patterns/dependencies over larger scopes, like entire documents.
- Europe > United Kingdom (0.04)
- North America > United States > New York (0.04)
- Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
- Europe > Croatia > Primorje-Gorski Kotar County > Rijeka (0.04)
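As background for the H ≈ 0.70 figure above, the classical rescaled-range (R/S) statistic is the textbook Hurst estimator for a one-dimensional series: the slope of log E[R/S] versus log window length approximates H. The minimal sketch below is not the paper's estimator; the window grid and the white-noise test series are assumptions, and the paper applies its analysis to token-level statistics of language rather than a generic array.

```python
# Hedged illustration: classical rescaled-range (R/S) Hurst estimation.
import numpy as np

def hurst_rs(x, min_win=16, n_wins=12):
    x = np.asarray(x, dtype=float)
    wins = np.unique(np.logspace(np.log10(min_win),
                                 np.log10(len(x) // 2), n_wins).astype(int))
    log_w, log_rs = [], []
    for w in wins:
        ratios = []
        for start in range(0, len(x) - w + 1, w):   # non-overlapping windows
            seg = x[start:start + w]
            dev = np.cumsum(seg - seg.mean())       # cumulative deviations
            r, s = dev.max() - dev.min(), seg.std() # range and scale
            if s > 0:
                ratios.append(r / s)
        log_w.append(np.log(w))
        log_rs.append(np.log(np.mean(ratios)))
    return np.polyfit(log_w, log_rs, 1)[0]          # slope approximates H

# white noise should come out near 0.5; LRD series give H > 0.5
print(hurst_rs(np.random.randn(10_000)))
```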
Deep learning the Hurst parameter of linear fractional processes and assessing its reliability
Dániel Boros, Bálint Csanády, Iván Ivkovic, Lóránt Nagy, András Lukács, László Márkus
This research explores the reliability of deep learning, specifically Long Short-Term Memory (LSTM) networks, for estimating the Hurst parameter of fractional stochastic processes. The study focuses on three types of processes: fractional Brownian motion (fBm), the fractional Ornstein-Uhlenbeck (fOU) process, and linear fractional stable motion (lfsm). The work involves fast generation of extensive fBm and fOU datasets, so the LSTM network can be trained on a large volume of data in feasible time. The study analyzes the accuracy of the LSTM network's Hurst parameter estimates under various performance measures, including RMSE, MAE, MRE, and quantiles of the absolute and relative errors. It finds that the LSTM outperforms traditional statistical methods on fBm and fOU processes but has limited accuracy on lfsm processes. The research also examines how training length and evaluation sequence length affect the LSTM's performance. The methodology is applied by estimating the Hurst parameter in Li-ion battery degradation data and obtaining confidence bounds for the estimate. The study concludes that while deep learning methods show promise for parameter estimation of fractional processes, their effectiveness is contingent on the process type and the quality of the training data.
- Energy > Energy Storage (0.69)
- Energy > Oil & Gas > Upstream (0.36)
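The fOU datasets this abstract mentions can be produced by driving an Euler scheme with fractional Gaussian noise increments, scaled by dt**H via self-similarity. A minimal sketch under illustrative assumptions (unit-scale driving fBm; kappa, sigma, and the time grid are placeholders, not settings from the paper):

```python
# Minimal sketch: fractional Ornstein-Uhlenbeck path via an Euler scheme
# driven by fractional Gaussian noise (fGn). Parameters are placeholders.
import numpy as np

def fgn(H, n, rng):
    """Exact fGn sample via Cholesky of its covariance (slow but simple)."""
    k = np.arange(n)
    gamma = 0.5 * ((k + 1.0) ** (2 * H) - 2.0 * k ** (2 * H)
                   + np.abs(k - 1.0) ** (2 * H))
    cov = gamma[np.abs(k[:, None] - k[None, :])]
    return np.linalg.cholesky(cov + 1e-10 * np.eye(n)) @ rng.standard_normal(n)

def fou_path(H, kappa=1.0, sigma=0.3, n=1000, dt=0.01, seed=0):
    rng = np.random.default_rng(seed)
    dB = dt ** H * fgn(H, n, rng)   # self-similarity: increments scale as dt**H
    x = np.zeros(n + 1)
    for i in range(n):              # dX = -kappa * X dt + sigma dB_H
        x[i + 1] = x[i] - kappa * x[i] * dt + sigma * dB[i]
    return x

path = fou_path(H=0.7)
```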
On the Theoretical Properties of Noise Correlation in Stochastic Optimization
Aurelien Lucchi, Frank Proske, Antonio Orvieto, Francis Bach, Hans Kersting
Studying the properties of stochastic noise to optimize complex non-convex functions has been an active area of research in the field of machine learning. Prior work has shown that the noise of stochastic gradient descent improves optimization by overcoming undesirable obstacles in the landscape. Moreover, injecting artificial Gaussian noise has become a popular idea to quickly escape saddle points. Indeed, in the absence of reliable gradient information, the noise is used to explore the landscape, but it is unclear what type of noise is optimal in terms of exploration ability. In order to narrow this gap in our knowledge, we study a general type of continuous-time non-Markovian process, based on fractional Brownian motion, that allows for the increments of the process to be correlated. This generalizes processes based on Brownian motion, such as the Ornstein-Uhlenbeck process. We demonstrate how to discretize such processes which gives rise to the new algorithm fPGD. This method is a generalization of the known algorithms PGD and Anti-PGD. We study the properties of fPGD both theoretically and empirically, demonstrating that it possesses exploration abilities that, in some cases, are favorable over PGD and Anti-PGD. These results open the field to novel ways to exploit noise for training machine learning models.
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > Switzerland > Basel-City > Basel (0.04)
- Europe > Norway > Eastern Norway > Oslo (0.04)
- (4 more...)
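To make the PGD / Anti-PGD / fPGD relationship concrete: perturbed gradient descent adds noise to each update, and the fractional generalization draws that noise from fractional Gaussian noise so successive perturbations are correlated. H = 0.5 recovers independent perturbations (PGD-like), while H < 0.5 gives the anticorrelated regime (Anti-PGD-like). The toy sketch below reflects that reading and is not the paper's fPGD pseudocode; the step size, noise scale, and quadratic test function are placeholders.

```python
# Toy rendition of the fPGD idea: gradient descent perturbed by fractional
# Gaussian noise, so perturbation increments are correlated when H != 0.5.
import numpy as np

def fgn(H, n, rng):
    k = np.arange(n)
    gamma = 0.5 * ((k + 1.0) ** (2 * H) - 2.0 * k ** (2 * H)
                   + np.abs(k - 1.0) ** (2 * H))
    cov = gamma[np.abs(k[:, None] - k[None, :])]
    return np.linalg.cholesky(cov + 1e-10 * np.eye(n)) @ rng.standard_normal(n)

def fpgd_like(grad, x0, steps=500, lr=0.05, noise_scale=0.1, H=0.3, seed=0):
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    # one pre-drawn fGn perturbation sequence per coordinate
    noise = np.stack([fgn(H, steps, rng) for _ in x], axis=1)
    for t in range(steps):
        x = x - lr * grad(x) + noise_scale * noise[t]
    return x

# toy quadratic; the interesting behavior is escaping saddles/flat regions
print(fpgd_like(lambda x: 2.0 * x, np.ones(2)))
```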
Understanding Long Range Memory Effects in Deep Neural Networks
Chengli Tan, Jiangshe Zhang, Junmin Liu
Stochastic gradient descent (SGD) is of fundamental importance in deep learning. Despite its simplicity, elucidating its efficacy remains challenging. Conventionally, the success of SGD is attributed to the stochastic gradient noise (SGN) incurred in the training process. Based on this general consensus, SGD is frequently treated and analyzed as the Euler-Maruyama discretization of a stochastic differential equation (SDE) driven by either Brownian or Lévy stable motion. In this study, we argue that SGN is neither Gaussian nor stable. Instead, inspired by the long-time correlation emerging in SGN series, we propose that SGD can be viewed as a discretization of an SDE driven by fractional Brownian motion (FBM). Accordingly, the different convergence behavior of SGD dynamics is well grounded. Moreover, the first passage time of an SDE driven by FBM is approximately derived. This indicates a lower escaping rate for a larger Hurst parameter, and thus SGD stays longer in flat minima. This happens to coincide with the well-known phenomenon that SGD favors flat minima that generalize well. Four groups of experiments are conducted to validate our conjecture, and it is demonstrated that long-range memory effects persist across various model architectures, datasets, and training strategies. Our study opens up a new perspective and may contribute to a better understanding of SGD.
- Asia > China (0.14)
- Europe > United Kingdom (0.14)
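A simple way to probe a series for the long-range memory this abstract describes is the aggregated-variance method: for an LRD series, the variance of block means of size m decays like m**(2H - 2), so a log-log regression recovers H. The sketch below runs on a stand-in white-noise series (expected H near 0.5); the paper instead measures actual stochastic gradient noise collected along training.

```python
# Aggregated-variance Hurst estimate: Var(block means of size m) ~ m**(2H - 2).
import numpy as np

def hurst_aggvar(x, n_scales=10):
    x = np.asarray(x, dtype=float)
    ms = np.unique(np.logspace(0.5, np.log10(len(x) // 10),
                               n_scales).astype(int))
    log_m, log_v = [], []
    for m in ms:
        n_blocks = len(x) // m
        means = x[:n_blocks * m].reshape(n_blocks, m).mean(axis=1)
        log_m.append(np.log(m))
        log_v.append(np.log(means.var()))
    slope = np.polyfit(log_m, log_v, 1)[0]   # slope estimates 2H - 2
    return 1.0 + slope / 2.0

# stand-in series: short-memory white noise should give H near 0.5
print(hurst_aggvar(np.random.randn(50_000)))
```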